Music Structure Discovery in Popular Music using Non-negative Matrix Factorization

Authors

  • Florian Kaiser
  • Thomas Sikora
Abstract

We introduce a method for automatically extracting musical structure from popular music. The proposed algorithm applies non-negative matrix factorization (NMF) to a self-similarity matrix of the audio data in order to segment regions of acoustically similar frames. We show that structural parts can easily be modeled over the dimensions of the NMF decomposition. Based on that observation, we introduce a clustering algorithm that can explain the structure of the whole music piece. The preliminary evaluation reported in the paper shows very encouraging results.
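The idea in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the abstract does not specify the audio features, the NMF variant, or the clustering step, so the sketch assumes toy two-dimensional features, cosine similarity, Lee-Seung multiplicative updates, and a simple "largest activation" rule in place of the paper's clustering algorithm. All function names (`self_similarity`, `nmf`, `label_frames`, `segments`) are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def self_similarity(features):
    """Cosine similarity between every pair of frames.

    The result is non-negative as long as the features are non-negative.
    """
    unit = []
    for f in features:
        norm = math.sqrt(sum(x * x for x in f))
        unit.append([x / norm for x in f])
    return matmul(unit, transpose(unit))


def nmf(s, rank, iterations=500, seed=0, eps=1e-9):
    """Factorize s ~= w @ h with multiplicative updates (Euclidean cost)."""
    rng = random.Random(seed)
    n, m = len(s), len(s[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T S) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, s), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (S H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(s, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


def label_frames(w):
    """Assumed stand-in for clustering: pick the strongest NMF component per frame."""
    return [max(range(len(row)), key=row.__getitem__) for row in w]


def segments(labels):
    """Collapse per-frame labels into (start, end, label) runs, end exclusive."""
    runs, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            runs.append((start, i, labels[start]))
            start = i
    return runs


# Toy "song" with form A-B-A: three frames per part, made-up 2-D features.
part_a, part_b = [1.0, 0.0], [0.0, 1.0]
features = [part_a] * 3 + [part_b] * 3 + [part_a] * 3

ssm = self_similarity(features)
w, h = nmf(ssm, rank=2)
labels = label_frames(w)
print(segments(labels))
```

On this toy input the script should print three runs, with the first and last run sharing a label. Which integer each part receives depends on the random initialization, so only the grouping is meaningful. Real recordings would need actual audio features and a more careful choice of rank and clustering.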

Similar resources

Vocalist Gender Recognition in Recorded Popular Music

We introduce the task of vocalist gender recognition in popular music and evaluate the benefit of Non-Negative Matrix Factorization based enhancement of melodic components to this aim. The underlying automatic separation of drum beats is described in detail, and the obtained significant gain by its use is verified in extensive test-runs on a novel database of 1.5 days of MP3 coded popular songs...

Constrained Nonnegative Matrix Factorization with Applications to Music Transcription

We apply nonnegative matrix factorization to the task of music transcription. In music transcription we are given an audio recording of a musical piece and attempt to find the underlying sheet music which generated the music. We improve upon current transcription results by imposing novel temporal and sparsity constraints which exploit the structure of music. We demonstrate the effectiveness or...

Automatic Music Transcription based on Non-Negative Matrix Factorization

In this paper, we present a method for the automatic transcription of polyphonic piano music. The input to this method consists in piano music recordings stored in WAV files, while the pitch of all the notes in the corresponding score forms the output. This method operates on a frame-by-frame basis and exploits a suitable time-frequency representation of the audio signal. The solution proposed ...

Speech Recognition in Mixed Sound of Speech and Music Based on Vector Quantization and Non-Negative Matrix Factorization

This paper describes a speech recognition method for mixed sound, consisting of speech and music, that removes the music only based on vector quantization (VQ) and non-negative matrix factorization (NMF). For isolated word recognition using the clean speech model, an improvement of about 15% was obtained compared with the case of not removing music. Furthermore, a high recognition rate of about...

Speech recognition based on Itakura-Saito divergence and dynamics/sparseness constraints from mixed sound of speech and music by non-negative matrix factorization

We considered a speech recognition method for mixed sound, which is composed of both speech and music, that only removes music based on non-negative matrix factorization (NMF). We used Itakura-Saito divergence instead of Kullback-Leibler divergence to compare the cost function, and the dynamics and sparseness constraints of a weight matrix to improve speech recognition. For isolated word recogn...

Publication date: 2010